Abstract— Early identification of plant diseases is crucial, as diseases can hinder the growth of the affected species. Many machine learning models have been utilised for detecting and classifying plant diseases, but the advent of deep learning, a subset of machine learning, has revolutionised this field by offering greater accuracy. This article reviews recent research progress on the use of deep learning technology in the identification of crop leaf diseases. The current trends and challenges in plant leaf disease detection using advanced imaging techniques and deep learning are presented. The survey aims to provide a valuable resource for researchers investigating plant disease detection with state-of-the-art models, saving them time and cost. Additionally, the article addresses some of the current challenges and issues in the detection process that need to be resolved.

Keywords— Plant Disease Detection, Deep Learning, Survey, Convolutional Neural Network, 
Agriculture. 


I. INTRODUCTION

Plants are the producers and, after the sun, the most important part of the food chain. Plant health is an important consideration for the environment as well as for the food safety and food security of the world population. In fact, it is closely linked to the “One Health” concept [1], which addresses ways of fighting the health issues of humans and animals, tackling environmental issues and controlling the spread of diseases [2]. Considering the importance of plant health, the United Nations declared the year 2020 the International Year of Plant Health [3]. Savary et al. estimated that the yield loss due to plant disease for five major crops, viz. wheat, rice, maize, potato and soybean, ranges from 8.1% to 41.1% [4]. In terms of the global economy, plant diseases cost over 220 billion dollars per annum [5]. Disease control and mitigation are therefore of utmost importance, and they require that diseases be properly understood and classified.

Plants with a disease typically have noticeable stains or lesions on their leaves, shoots, fruits or flowers. The majority of diseases and pest conditions exhibit a distinct visual pattern that can be utilised to identify them specifically. Most disease signs first develop on the leaves, which are therefore typically the main source for identifying plant illnesses [6]. Fig. 1 shows example images of various plant diseases from several datasets [7], [8], [9], [10].

On-site identification of plant diseases is typically done by agricultural experts, or by farmers relying on their own knowledge. This approach is not only subjective, but also arduous, time-consuming and inefficient. Inexperienced farmers are more likely to misidentify diseases and to apply treatments carelessly. The resulting loss of quality and yield, together with environmental contamination, leads to avoidable financial losses. To overcome these difficulties, the application of image processing techniques to plant disease identification has emerged as a popular area of study.


Fig. 1: Some images from various datasets of various plant diseases. From [10] 1.a: Grey Leaf Spot, 1.b: Northern Leaf Spot, 1.c: Northern Leaf Blight. From [9] 1.d: Coffee Blister Spot, 1.e: Rice Leaf Scald, 1.f: Cashew Powdery Mildew. From [7] 1.g: Cherry Powdery Mildew, 1.h: Strawberry Leaf Scorch, 1.i: Peach Bacterial Spot. From [8] 1.j: Tomato Septoria Leaf Spot, 1.k: Potato Early Blight, 1.l: Grape Leaf Black Rot.


Traditional image processing algorithms produced acceptable results for plant disease identification from leaf images, but mostly on small datasets and with largely theoretical conclusions. Deep learning is widely used for script identification [30-38] and for human disease detection [39-44]. It has revolutionised the field of computer vision, particularly object detection and image classification. Deep learning combined with transfer learning is now regarded as a promising tool for improving plant disease detection systems: achieving better results, widening the scope of disease detection and implementing practical real-time systems for the identification of plant diseases.

Plenty of reviews and survey articles are available [45-53], but this article surveys the most recent advancements in plant disease detection using various deep learning techniques. To track these advancements, openly accessible articles published in 2022 and 2023 have been selected as references.

This paper contains five sections. Section 2 introduces the datasets used by the surveyed articles. Section 3 surveys 15 selected articles on recent advances in plant disease detection. Section 4 discusses the future scope of the topic. Section 5 concludes the paper.


II. DATASET

Typically, a deep learning dataset comprises three subsets: the training set, the validation set and the test set. The training set facilitates the learning process of the model, while the validation set is commonly used to tune hyperparameters during the training phase. The test set contains data samples that the model has not previously encountered and serves to assess the performance of the trained model. This section discusses the available datasets and how they were developed, i.e., the sources of their images.
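
The three-way split described above can be sketched as follows. This is a minimal illustration only: the 70/15/15 ratio, the fixed seed and the function name `split_dataset` are assumptions and are not prescribed by any of the surveyed articles.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle samples and split them into train, validation and test lists.

    The fractions are illustrative defaults (an assumption), not values
    taken from the surveyed articles.
    """
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must lie strictly between 0 and 1")
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded, so the split is reproducible
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder is held out for final evaluation
    return train, val, test


# Toy usage: integers stand in for image file paths.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Keeping the test subset untouched until the very end is what makes it a fair estimate of performance on unseen images.
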

Khan et al. [11] used the dataset of [12], which covers six distinct cucumber leaf diseases: downy mildew, powdery mildew, mosaic, anthracnose, angular spot and blight. Initially, each class comprises 100 to 150 photos. The authors devised a straightforward data augmentation method consisting of four operations: vertical flip, horizontal flip, and rotations of 45 and 60 degrees. Applying this method to each cucumber disease class increases the number of photos per class to 2000. The augmented dataset is then used to train the deep models.
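
The four augmentation operations can be sketched on a toy image as follows. This is not the implementation of [11]: the image is a nested list of pixel values, and the helper names (`hflip`, `vflip`, `rotate`, `augment`), the nearest-neighbour sampling and the zero fill for uncovered corners are all assumptions made for illustration. How the operations are combined to reach 2000 images per class is not covered here.

```python
import math


def hflip(img):
    """Horizontal flip: mirror each row left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Vertical flip: reverse the order of the rows."""
    return [row[:] for row in img[::-1]]


def rotate(img, degrees, fill=0):
    """Rotate about the image centre with nearest-neighbour sampling.

    The output keeps the input size; pixels that map outside the source
    image are set to `fill` (an assumption for this sketch).
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel for each output pixel.
            dx, dy = x - cx, y - cy
            sx = cos_t * dx + sin_t * dy + cx
            sy = -sin_t * dx + cos_t * dy + cy
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = img[iy][ix]
    return out


def augment(img):
    """Return the four augmented variants named in the text."""
    return [vflip(img), hflip(img), rotate(img, 45), rotate(img, 60)]


toy_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
variants = augment(toy_image)
print(len(variants))
```
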

A well-known dataset is “PlantVillage”, which was originally published as [14] and later republished in a paper and made available as [7]. It contains a total of 54303 images covering 14 different plants and 38 different disease classes. Some example images from the dataset are shown in Fig. 2. Mahum et al. [13] used 2152 images of three classes of potato leaves taken from [14], together with 1700 self-collected images of two potato leaf disorders. The PlantVillage dataset is also used, in part or as a whole, in the articles [15], [16], [17], [18], [58], [59].


Fig. 2: Some example images. 2.1.a: Healthy tomato leaf, 2.1.b: Tomato leaf with Leaf Mold, 2.1.c: Tomato leaf with Early Blight, 2.1.d: Tomato leaf with Mosaic virus, 2.2.a: Healthy apple leaf, 2.2.b: Apple leaf with Cedar Apple Rust, 2.2.c: Apple leaf with Black Rot, 2.2.d: Apple leaf with Apple Scab, 2.3.a: Healthy grape leaf, 2.3.b: Grape leaf with Esca (Black Measles), 2.3.c: Grape leaf with Black Rot, 2.3.d: Grape leaf with Leaf Blight, 2.4.a: Healthy corn leaf, 2.4.b: Corn leaf with Common Rust, 2.4.c: Corn leaf with Gray Leaf Spot, 2.4.d: Corn leaf with Northern Leaf Blight, 2.5.a: Healthy peach leaf, 2.5.b: Peach leaf with Bacterial Spot, 2.5.c: Healthy potato leaf, 2.5.d: Potato leaf with Late Blight.


Along with [7], Pandian et al. [15] used the datasets [19], [20], [21], [22] to create a dataset consisting of 58 plant disease classes and one no-leaf class. Divyanth et al. [23] used a custom-made dataset [24] with three classes of corn diseases. A guava leaf dataset [26] of four disease classes is used by Nandi et al. [25]. In their article on wheat diseases, Long et al. [27] used their own dataset of 19160 images in five classes, a small part of which is available at [28].

Along with the PlantVillage dataset [14], Ahmad et al. [16] used the PlantDoc dataset [8], the Digipathos dataset [9], the NLB dataset [29] and the CD&S dataset [10]. CD&S is a custom dataset of images acquired at the Purdue Agronomy Center for Research and Education (ACRE), consisting of three disease classes: Northern Leaf Blight, Northern Leaf Spot and Gray Leaf Spot.

Algani et al. [54] used a dataset of citrus fruits and leaves [60]. Yong et al. [55] used a dataset [61] of oil palm seedlings for their work. Ma et al. [56] created their own dataset by collecting images from the Jilin Academy of Agricultural Sciences. Guerrero-Ibañez and Reyes-Muñoz [57] used a public dataset [62] of tomato leaves with 11000 images in 10 categories, to which they added 2500 images from their own collection.

A consolidated summary of the available plant disease datasets is given in Table 1.


Table 1: Summary of the state of the art plant disease datasets. 


Author(s)                    Year   Dataset specification
Zhang et al. [12]            2017   Six classes of cucumber leaf.
Hughes and Salathé [14]      2015   38 disease classes of 14 different plants.
J and Gopal [7]              2019   38 disease classes of 14 different plants.
Singh et al. [8]             2020   17 classes of disease of 13 different plants.
Barbedo et al. [9]           2018   171 disease classes and 21 different plants.
Ahmed [10]                   2021   3 classes of corn disease.
Hu et al. [19]               2019   3 classes of tea leaf disease.
Kour and Arora [20]          2019   2 classes with 16 subclasses of 8 different plants.
Krohling et al. [21]         2019   Healthy and diseased Arabica coffee leaves.
Parraga-Alava et al. [22]    2019   Healthy and diseased Robusta coffee leaves.
Ahmad et al. [24]            2021   3 classes of corn disease.
Rajbongshi et al. [26]       2022   Images of guava diseases of six classes.
Long et al. [28]             2022   999 wheat disease images with five classes.
Wiesner-Hanks et al. [29]    2018   Images of northern leaf blight of maize.
Rauf et al. [60]             2019   Citrus fruits and leaves dataset.
Azmi et al. [61]             2020   Oil palm seedling images.
B et al. [62]                2020   10 classes of tomato leaves including healthy leaves.


III. STATE-OF-THE-ART METHODS

In this section, recent studies that employ popular machine learning architectures for identifying and classifying leaf diseases are presented. Additionally, some related works that introduce modified or improved versions of deep learning architectures to achieve better results are discussed.

Khan et al. [11] propose an Entropy-ELM-based deep learning system to identify diseases of cucumber leaves. Four pre-trained deep models, VGG16, ResNet50, ResNet101 and DenseNet201, are trained in the proposed framework, and one of them is chosen based on accuracy. This model is then used to select the best features with the proposed Entropy-ELM technique. In a parallel step, the features of all pre-trained models are fused and the same feature selection strategy is applied. The final stage combines the features from the previous two phases to perform classification. Tested on the augmented cucumber leaf dataset, the proposed framework achieved an accuracy of 98.48%. A total of nine classifiers are used: four types of SVM (Linear, Cubic, Quadratic and MG SVM) and five types of KNN (Fine, Weighted, Subspace, Cosine, Cubic and Medium KNN). Each classifier's performance is measured with a variety of metrics, including F1-score, precision, recall, time and accuracy.
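
For reference, the standard definitions of precision, recall, F1-score and accuracy can be sketched for one class as below. The function name `classification_metrics` and the toy labels are hypothetical; the sketch shows only the textbook formulas and is not the evaluation code of [11].

```python
def classification_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 for one class, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = correct / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels (illustrative only).
y_true = ["mildew", "mildew", "blight", "mosaic"]
y_pred = ["mildew", "blight", "blight", "mildew"]
print(classification_metrics(y_true, y_pred, positive="mildew"))
```
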

Mahum et al. [13] proposed a model based on an efficient DenseNet201 architecture, which contains one more transition layer than the original DenseNet architecture. This improves compactness and reduces the computational load. The cross-entropy loss function is reweighted to address the problem of class imbalance within the dataset. The limited size of the training and testing images lets the model identify diseases in potato leaves effectively and efficiently. Owing to the additional transition layer and the preprocessed images, the system achieves 97.2% accuracy while remaining computationally fast.
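
The idea of reweighting the cross-entropy loss for class imbalance can be sketched as below. The exact weighting scheme of [13] is not given in this survey, so the inverse-frequency weights, the function names and the toy labels are assumptions chosen for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes weigh more.

    This particular scheme is an assumption, not the one confirmed for [13].
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Weighted mean of -w[y] * log(p[y]) over all samples.

    probs is a list of dicts mapping class name to predicted probability.
    """
    total, norm = 0.0, 0.0
    for p, y in zip(probs, labels):
        total += -weights[y] * math.log(max(p[y], eps))  # eps guards log(0)
        norm += weights[y]
    return total / norm


# Toy imbalanced label set: 8 healthy leaves, 2 diseased leaves.
labels = ["healthy"] * 8 + ["late_blight"] * 2
weights = inverse_frequency_weights(labels)
uniform = [{"healthy": 0.5, "late_blight": 0.5}] * len(labels)
print(weights, weighted_cross_entropy(uniform, labels, weights))
```

With these weights a misclassified minority-class sample contributes more to the loss than a misclassified majority-class sample, which is the effect the reweighting aims for.
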


In the article by Pandian et al. [15], a deep convolutional neural network with 14 layers (14-DCNN) is proposed. To obtain a balanced dataset from various public datasets, image augmentation techniques such as deep convolutional generative adversarial networks, neural style transfer and basic image manipulation were used. A coarse-to-fine search strategy combined with random search was used to choose the most appropriate hyperparameter values and thereby improve the training performance of the proposed DCNN model. The training and validation accuracies of the 14-DCNN model were 99.993% and 99.985%, respectively. Since the proposed 14-DCNN has fewer convolutional and pooling operations than the transfer learning approaches, its training time was shorter than theirs.
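
A coarse-to-fine random search can be sketched as follows: sample a hyperparameter at random over a wide range, then repeatedly narrow the range around the best value found and sample again. The function names, the shrink factor, the number of rounds and the toy objective (a stand-in for validation accuracy) are assumptions; the actual search space of [15] is not described in this survey.

```python
import random


def random_search(objective, low, high, n_trials, rng):
    """Evaluate n_trials uniformly sampled values and return the best one."""
    best_x, best_score = None, float("-inf")
    for _ in range(n_trials):
        x = rng.uniform(low, high)
        score = objective(x)
        if score > best_score:
            best_x, best_score = x, score
    return best_x, best_score


def coarse_to_fine_search(objective, low, high, rounds=3, n_trials=20,
                          shrink=0.25, seed=0):
    """Run random search repeatedly, shrinking the range around the best value."""
    rng = random.Random(seed)
    best_x, best_score = None, float("-inf")
    for _ in range(rounds):
        x, score = random_search(objective, low, high, n_trials, rng)
        if score > best_score:
            best_x, best_score = x, score
        # Narrow the window to a fraction of its width, centred on the best value.
        half_width = (high - low) * shrink / 2
        low, high = max(low, best_x - half_width), min(high, best_x + half_width)
    return best_x, best_score


def toy_objective(x):
    """Hypothetical stand-in for validation accuracy, peaking at x = 0.3."""
    return -(x - 0.3) ** 2


print(coarse_to_fine_search(toy_objective, 0.0, 1.0))
```

In practice each call to the objective would train and validate a model, so the number of trials per round is the main cost to budget.
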

Divyanth et al. [23] used three semantic segmentation models, SegNet, UNet and DeepLabV3+, in two stages. Stage one extracts the leaf image from the complex background, and stage two performs the disease detection. Comparing the segmentation models by their performance, they found that UNet performed best in stage one and DeepLabV3+ in stage two. They also estimated the severity of the disease by calculating the area of the disease lesions, with improved results.
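
Severity estimation from lesion area can be illustrated with two binary masks. The sketch assumes that stage one yields a leaf mask and stage two a lesion mask, and that severity is the percentage of leaf pixels covered by lesions; the function name `disease_severity` and this exact formula are assumptions, not details confirmed for [23].

```python
def disease_severity(leaf_mask, lesion_mask):
    """Percentage of leaf pixels that are also lesion pixels.

    Both masks are nested lists of 0/1 values with the same shape.
    Lesion pixels outside the leaf mask are ignored.
    """
    leaf_pixels = 0
    lesion_pixels = 0
    for leaf_row, lesion_row in zip(leaf_mask, lesion_mask):
        for is_leaf, is_lesion in zip(leaf_row, lesion_row):
            if is_leaf:
                leaf_pixels += 1
                if is_lesion:
                    lesion_pixels += 1
    if leaf_pixels == 0:
        raise ValueError("leaf mask contains no leaf pixels")
    return 100.0 * lesion_pixels / leaf_pixels


# Toy 4x4 masks (illustrative only).
leaf = [[0, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
lesion = [[0, 0, 1, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 0],
          [1, 0, 0, 0]]
print(disease_severity(leaf, lesion))
```
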

Nandi et al. [25] used five CNN models: VGG-16, GoogleNet, ResNet-18, MobileNet-v2 and EfficientNet. They applied model quantization techniques to these models and found that GoogleNet achieved the smallest size with 97% accuracy, while EfficientNet achieved 99% accuracy with a reasonably small size after quantization.
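
Quantization shrinks a model by storing weights with fewer bits. The sketch below shows one common variant, affine 8-bit quantization of a list of weights; the survey does not say which quantization techniques [25] applied, so the scheme and the function names `quantize` and `dequantize` are assumptions.

```python
def quantize(weights, num_bits=8):
    """Map float weights to integers in [0, 2**num_bits - 1] (affine scheme)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    # Fall back to a scale of 1.0 when all weights are equal, to avoid dividing by zero.
    scale = (w_max - w_min) / (qmax - qmin) or 1.0
    zero_point = round(qmin - w_min / scale)
    quantized = [min(qmax, max(qmin, round(w / scale) + zero_point)) for w in weights]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float weights from their integer codes."""
    return [scale * (q - zero_point) for q in quantized]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, zero_point = quantize(weights)
restored = dequantize(codes, scale, zero_point)
print(codes)
print(restored)
```

Each weight now needs one byte instead of four (for 32-bit floats), at the cost of a reconstruction error of roughly one quantization step at most.
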

Long et al. [27] used the RMSProp optimizer to train their model, CerealConv, which achieved a classification accuracy of 97.05%. When compared with trained pathologists on a sample of images from the larger dataset, the model's accuracy was 2% higher.
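
The RMSProp update divides each gradient by a running root-mean-square of recent gradients. A minimal sketch of the standard update rule is given below; the learning rate, decay and epsilon values and the quadratic toy problem are assumptions, since the settings used in [27] are not reported in this survey.

```python
import math


def rmsprop_step(params, grads, cache, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp update for a list of scalar parameters.

    cache holds the running average of squared gradients per parameter.
    """
    new_params, new_cache = [], []
    for p, g, c in zip(params, grads, cache):
        c = decay * c + (1 - decay) * g * g  # running mean of squared gradients
        new_cache.append(c)
        new_params.append(p - lr * g / (math.sqrt(c) + eps))
    return new_params, new_cache


# Toy problem: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, cache = [0.0], [0.0]
for _ in range(2000):
    grad = [2 * (w[0] - 3.0)]
    w, cache = rmsprop_step(w, grad, cache)
print(w[0])
```
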

Algani et al. [54] used a CNN with Ant Colony Optimization (ACO-CNN). In their study, the ACO-CNN model outperformed the C-GAN, CNN and SGD models in terms of accuracy, precision, recall and F1-score. The accuracy rates for C-GAN, CNN and SGD are 99.6%, 99.97% and 85%, respectively. The ACO-CNN model reached an accuracy of 99.98% and also attained the highest F1-score among the compared models.

Yong et al. [55] worked specifically on the detection of Basal Stem Rot, presenting an approach based on hyperspectral imaging and deep learning. The method involves dividing the seedling's top-down view into regions and analysing spectral changes across leaf positions. Segmented images of the plant were generated using a Mask Region-based Convolutional Neural Network (Mask RCNN) to assess the impact of the image background on detection accuracy. They trained their system using VGG16 and Mask RCNN and obtained the highest precision, 94.32%, with VGG16.

Ma et al. [56] extracted multidimensional 
features from both spatial and channel perspectives 
using an attention module that was integrated into the 
cross-stage partial network backbone. Additionally, 
they incorporated a spatial pyramid pooling module 
that utilises dilated convolutions into the network to 
expand the range of crop-disease-related information 
collected from images of crops. Their proposed model 
CCA-YOLO obtained an average precision of 90.15%. 
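
The effect of dilation can be seen in a one-dimensional toy example: spacing the kernel taps apart widens the receptive field without adding weights. The sketch below is not the spatial pyramid pooling module of CCA-YOLO; the function names and the 1-D setting are assumptions made for illustration. As is usual in deep learning, the operation is implemented as a cross-correlation without padding.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel application."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """1-D dilated cross-correlation with no padding ('valid' output)."""
    span = receptive_field(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are placed `dilation` positions apart in the input.
        output.append(sum(k * signal[start + i * dilation]
                          for i, k in enumerate(kernel)))
    return output


signal = list(range(10))
kernel = [1, 1, 1]
for d in (1, 2, 3):
    print(d, receptive_field(len(kernel), d), dilated_conv1d(signal, kernel, d))
```
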

Guerrero-Ibañez and Reyes-Muñoz [57] designed a CNN-based architecture that incorporates GAN (Generative Adversarial Network)-based data augmentation techniques for the early identification and classification of diseases in tomato leaves. They achieved a highest accuracy of 99.64% in disease classification.

Saeed et al. [58] discussed the identification of tomato leaf diseases by categorising images of healthy and unhealthy tomato leaves with two pre-trained CNNs, Inception V3 and Inception ResNet V2. They trained these models on the public PlantVillage dataset and obtained a highest validation accuracy of 99.22%.

Joshi and Bhavsar [59] used standard deep learning models to classify nine categories of leaf diseases. They also developed their own CNN framework for the same task, which obtained better results than the standard models. They reported a highest classification accuracy of 95%.

Ahmad et al. [16] assessed the effectiveness of five standard deep learning models in identifying plant diseases across diverse environmental conditions. These models were trained using corn disease images from public datasets. They observed that DenseNet169 yielded the highest generalisation performance for identifying plant diseases, achieving a validation accuracy of 81.60%.

A fine-tuning method for CNN models to classify tomato leaf disease was discussed by Ulutaş and Aslantaş [17]. The authors performed hyperparameter optimization using the particle swarm optimization (PSO) algorithm, while the weights of the architectures were optimised using grid search. They also proposed triple and quintuple ensemble models and classified the datasets using a cross-validation approach. With the ensemble method they reported a highest classification accuracy of 99.60%.
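
One common way to combine several CNNs into an ensemble is weighted soft voting: average the class probabilities of the member models and pick the class with the highest average. Whether [17] combines its models exactly this way is not stated in this survey, so the function name `soft_vote`, the weighting and the toy probabilities below are assumptions.

```python
def soft_vote(model_probs, weights=None):
    """Weighted average of per-model class probabilities.

    model_probs is a list with one probability vector per model.
    Returns the index of the winning class and the averaged probabilities.
    """
    n_models = len(model_probs)
    if weights is None:
        weights = [1.0] * n_models  # plain averaging when no weights are given
    total = sum(weights)
    n_classes = len(model_probs[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: averaged[c])
    return predicted, averaged


# Toy triple ensemble over three classes (illustrative probabilities).
triple = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]
print(soft_vote(triple))
```

The per-model weights are the kind of quantity that a grid search could tune on validation data.
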

Francis et al. [18] described the application of standard deep learning models in agriculture for automatically generating features and developing a predictive system. The authors emphasised the importance of segmenting diseased areas, transfer learning and fine-tuning the model. They initially trained on a dataset of healthy and diseased apple leaves and evaluated the performance of multiple MobileNet models with varying depth and resolution multipliers. They obtained a highest accuracy of 99.7% using the combination of MobileNet and the K-means clustering method.
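
K-means clustering can separate pixels into groups such as healthy and diseased tissue before classification. The sketch below runs Lloyd's algorithm on one-dimensional intensity values; how [18] applies K-means is not detailed in this survey, so the 1-D setting, the deterministic initialisation and the function name `kmeans_1d` are assumptions.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Lloyd's algorithm on scalar values; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: evenly spaced order statistics.
    if k == 1:
        centers = [ordered[0]]
    else:
        centers = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Recompute each center as the mean of its cluster (keep it if empty).
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers

    labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
    return centers, labels


# Toy pixel intensities: a dark group and a bright group.
pixels = [10, 12, 11, 200, 205, 198, 9, 202]
print(kmeans_1d(pixels, k=2))
```
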

Table 2 presents, in chronological order, the major improvements in the techniques of plant disease detection and classification.


Table 2: Notable improvement in plant leaf disease detection and classification. 


Author(s) | Year | Method | Result | Remarks
Khan et al. [11] | Jan’ 2022 | Entropy-ELM | 98.4% | Entropy-ELM is used for feature selection. Classification done using F-DenseNet201.
Mahum et al. [13] | Apr’ 2022 | Efficient DenseNet201 | 97.2% | Reduced the impact of class imbalance using a reweighted cross-entropy loss function.
Pandian et al. [15] | Jul’ 2022 | 14-DCNN | Classification accuracy [illegible] and precision 99.79% | Optimised the values of the hyperparameters.
Divyanth et al. [23] | Aug’ 2022 | SegNet, UNet and DeepLabV3+ | For estimating disease severity, R² value obtained = 0.96 | Disease severity estimation done.
Nandi et al. [25] | Sep’ 2022 | VGG-16, GoogleNet, ResNet-18, MobileNet-v2 and EfficientNet with model quantization techniques | GoogleNet 97%, EfficientNet 99% | Model optimization used.
Long et al. [27] | 2022 | CerealConv with RMSProp optimizer | 97.05% | Used masked images to verify the working of the model.
Algani et al. [54] | Dec’ 2022 | ACO-CNN | 99.98% | Obtained better results than C-GAN, CNN and SGD models.
Yong et al. [55] | Dec’ 2022 | VGG-16 and Mask RCNN | 94.32% | Hyperspectral imaging used.
Ma et al. [56] | Jan’ 2023 | CCA-YOLO | 90.15% | Dual-attention module used with the CSPNet backbone network.
Guerrero-Ibañez and Reyes-Muñoz [57] | Jan’ 2023 | CNN with GAN-based data augmentation | 99.64% | GAN-based data augmentation techniques used.
Saeed et al. [58] | Jan’ 2023 | Inception V3 and Inception ResNet V2 | 99.22% | Transfer learning used.
Joshi and Bhavsar [59] | Jan’ 2023 | Night-CNN | 95% | It is relatively quick.
Ahmad et al. [16] | Jan’ 2023 | VGG16, ResNet50, InceptionV3, DenseNet169 and Xception | Average generalised testing accuracy of 81.60% | Generalised performance computed.
Ulutaş and Aslantaş [17] | Feb’ 2023 | Ensemble CNN | 99.60% | Particle swarm optimization algorithm used.
Francis et al. [18] | Feb’ 2023 | Four variants of MobileNet models with and without K-means algorithm | [illegible].6% without K-means, 99.7% with K-means | With K-means and without K-means algorithms compared.


IV. FUTURE SCOPE 
• In future, features can be improved using the Butterfly metaheuristic algorithm, and the EfficientNet deep model can be implemented for plant disease detection. Graph CNN and reinforcement learning can also be applied to get better results.

• Many domains, including human illness detection, activity and gesture recognition in security systems, and other plant disease detection issues, can use the Efficient DenseNet 201 model with certain adjustments to its architecture. With the adjustment of the parameters it might be possible to reduce the number of training images and the training time with similar or higher accuracy.

• The use of the 14-DCNN model can be extended to analyse disease severity and to detect diseases using other parts of a plant.

• By measuring the percentage of impacted regions and recommending necessary corrective actions, DL models can be expanded to anticipate severity.

• High accuracy can be achieved for real field plant images with diverse backgrounds.

• Mobile application based plant disease detection systems can be achieved which can run with low hardware resources and with fast detection abilities.


V. CONCLUSION 

This study highlighted and analysed various methodologies in terms of performance, datasets, plant leaf patterns and diverse classes of disease. It also analysed the limitations of the state of the art and pointed towards potential improvements. The study highlights the significance of incorporating computer vision, machine learning and deep learning into automated devices such as smartphones in modern agriculture. Future research should focus on extending disease detection systems from laboratory settings to field conditions while maintaining high identification accuracy, and should prioritise novel image processing algorithms that facilitate the segmentation and extraction of leaf lesion features in complicated scenarios.